The InterPro protein families database: the classification resource after 15 years

Authors

  • Alex L. Mitchell
  • Hsin-Yu Chang
  • Louise Daugherty
  • Matthew Fraser
  • Sarah Hunter
  • Rodrigo Lopez
  • Craig McAnulla
  • Conor McMenamin
  • Gift Nuka
  • Sebastien Pesseat
  • Amaia Sangrador-Vegas
  • Maxim Scheremetjew
  • Claudia Rato
  • Siew-Yit Yong
  • Alex Bateman
  • Marco Punta
  • Terri K. Attwood
  • Christian J. A. Sigrist
  • Nicole Redaschi
  • Catherine Rivoire
  • Ioannis Xenarios
  • Daniel Kahn
  • Dominique Guyot
  • Peer Bork
  • Ivica Letunic
  • Julian Gough
  • Matt E. Oates
  • Daniel H. Haft
  • Hongzhan Huang
  • Darren A. Natale
  • Cathy H. Wu
  • Christine A. Orengo
  • Ian Sillitoe
  • Huaiyu Mi
  • Paul D. Thomas
  • Robert D. Finn
Abstract

The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it enters its 15th year of operation, and give an overview of new developments with the database and its associated Web interfaces and software. In particular, the new domain architecture search tool is described and the process of mapping Gene Ontology terms to InterPro is outlined. We also discuss the challenges faced by the resource given the explosive growth in sequence data in recent years. InterPro (version 48.0) contains 36,766 member database signatures integrated into 26,238 InterPro entries, an increase of over 3,993 entries (5,081 signatures) since 2012.

Similar resources

InterPro: An Integrated Documentation Resource for Protein Families, Domains and Functional Sites

The exponential increase in the submission of nucleotide sequences to the nucleotide sequence database by genome sequencing centres has resulted in a need for rapid, automatic methods for classification of the resulting protein sequences. There are several signature and sequence cluster-based methods for protein classification, each resource having distinct areas of optimum application owing to...

The InterPro Database, 2003 brings increased coverage and new features

InterPro, an integrated documentation resource of protein families, domains and functional sites, was created in 1999 as a means of amalgamating the major protein signature databases into one comprehensive resource. PROSITE, Pfam, PRINTS, ProDom, SMART and TIGRFAMs have been manually integrated and curated and are available in InterPro for text- and sequence-based searching. The results are pro...

InterPro: Quick tour

Classifying proteins into families and identifying important domains and sites is invaluable for helping biologists to identify distantly related proteins and to predict their functions. A daunting array of resources, each with different strengths and weaknesses, is now available for searching genomes and proteomes with ‘protein signatures’ – diagnostic entities that are used to recognise a par...

The InterPro BioMart: federated query and web service access to the InterPro Resource

The InterPro BioMart provides users with query-optimized access to predictions of family classification, protein domains and functional sites, based on a broad spectrum of integrated computational models ('signatures') that are generated by the InterPro member databases: Gene3D, HAMAP, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs. These predictions are provided...

The InterPro database, an integrated documentation resource for protein families, domains and functional sites

Signature databases are vital tools for identifying distant relationships in novel sequences and hence for inferring protein function. InterPro is an integrated documentation resource for protein families, domains and functional sites, which amalgamates the efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. Each InterPro entry includes a functional description, annotation, liter...

Volume 43

Publication date: 2015